Simulation of Within-Session Query Variations Using a Text Segmentation Approach

Authors

  • Debasis Ganguly
  • Johannes Leveling
  • Gareth J. F. Jones
Abstract

We propose a generative model for automatic query reformulation from an initial query using the underlying subtopic structure of the top-ranked retrieved documents. We address three types of query reformulation: a) specialization; b) generalization; and c) drift. To test our model, we generate the three reformulation variants starting with selected fields from the TREC-8 topics as the initial queries. We use manual judgments from multiple assessors to calculate the accuracy of the reformulated query variants and observe accuracies of 65%, 82%, and 69% for specialization, generalization, and drift reformulations, respectively.
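The abstract does not give the details of the generative model, so the following is only a toy sketch of the general idea of deriving the three reformulation types from segments of top-ranked documents. Everything in it is an assumption made for illustration: the function names (`tokens`, `segments`, `reformulate`), the fixed-size sentence windows standing in for a real text segmenter, and the simple overlap and frequency heuristics. It is not the authors' method.

```python
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on"}


def tokens(text):
    """Lowercased alphabetic tokens with stopwords removed."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w.isalpha() and w not in STOPWORDS]


def segments(doc, size=2):
    """Fixed-size sentence windows; a crude stand-in for text segmentation."""
    sentences = [s.strip() for s in doc.split(".") if s.strip()]
    return [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]


def frequent_new_terms(segment, query_terms, k):
    """The k most frequent segment terms that are not already in the query."""
    counts = Counter(t for t in tokens(segment) if t not in query_terms)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def reformulate(query, top_docs, kind, k=1):
    """Hypothetical heuristics for the three reformulation types."""
    q = tokens(query)
    segs = [s for d in top_docs for s in segments(d)]
    # Number of query terms shared with each segment.
    overlap = [len(set(q) & set(tokens(s))) for s in segs]
    # Number of segments in which each query term occurs.
    seg_freq = {t: sum(t in tokens(s) for s in segs) for t in q}
    rarest = min(q, key=lambda t: seg_freq[t])

    if kind == "specialization":
        # Add terms from the segment closest to the query.
        best = segs[overlap.index(max(overlap))]
        return q + frequent_new_terms(best, q, k)
    if kind == "generalization":
        # Drop the query term confined to the fewest segments.
        return [t for t in q if t != rarest] if len(q) > 1 else q
    if kind == "drift":
        # Swap the narrowest term for terms from the least related segment.
        other = segs[overlap.index(min(overlap))]
        return [t for t in q if t != rarest] + frequent_new_terms(other, q, k)
    raise ValueError(f"unknown reformulation type: {kind}")


example_doc = (
    "Solar panels convert sunlight into electricity. "
    "Solar panels need silicon cells. "
    "Wind turbines generate electricity. "
    "Wind turbines need steady wind."
)

for kind in ("specialization", "generalization", "drift"):
    print(kind, reformulate("solar electricity", [example_doc], kind))
```

In this toy setting, specialization lengthens the query with terms from the most query-like segment, generalization shortens it, and drift keeps part of the query while pulling in vocabulary from a different subtopic. A real system would replace the sentence windows with a proper segmentation algorithm and the counts with a retrieval-model score.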

Similar resources

A Time-Frequency approach for EEG signal segmentation

The record of human brain neural activities, namely electroencephalogram (EEG), is generally known as a non-stationary and nonlinear signal. In many applications, it is useful to divide the EEGs into segments within which the signals can be considered stationary. Combination of empirical mode decomposition (EMD) and Hilbert transform, called Hilbert-Huang transform (HHT), is a new and powerful ...

Connected Component Based Word Spotting on Persian Handwritten image documents

Word spotting is to make searchable unindexed image documents by locating word/words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...

University of Essex at the TREC 2012

The primary goal of our participation in the Session track is to further evaluate our anchor expansion technique proposed in the previous year [1]. In particular, we aim to test the effectiveness of this approach on a more realistic dataset collected this year. SWIRL 2012 noted that there is still a large gap between the study of users and the study of IR algorithms [2], so the session data col...

Semiautomatic Image Retrieval Using the High Level Semantic Labels

Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges related to each of these approaches lead researchers to use combined approaches and semi-automatic retrieval that involves user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provides two kinds of qu...

A Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling

In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...

Publication date: 2011